Remove source stage for amd-bootc #1

fabiendupont
The multi-stage build has more stages than it needs. When the `amdgpu-dkms` package is installed, the modules are built and installed in `/lib/modules/${KERNEL_VERSION}`. If the package is installed in the `driver-toolkit` image, very few extra dependencies are required. This change removes the `source` stage and installs the `amdgpu-dkms` package on top of `driver-toolkit`.

The `amdgpu-dkms` package installs the modules in `/lib/modules/${KERNEL_VERSION}/extra`, and these are the only modules in that folder. The `amdgpu-dkms-firmware` package is installed as a dependency of `amdgpu-dkms` and installs the firmware files in `/lib/firmware/updates/amdgpu`. So, this change removes the in-tree `amdgpu` modules and firmware, then copies in the ones generated by DKMS in the `builder` stage.
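
As a rough illustration of the swap described above, here is a minimal shell sketch. The function name `swap_amdgpu` and its two root-directory arguments are hypothetical stand-ins for the builder stage and final image filesystems, and the in-tree locations it removes (`kernel/drivers/gpu/drm/amd/amdgpu` and `/lib/firmware/amdgpu`) are assumptions that should be checked against the actual kernel and firmware packages. It is not the Containerfile logic from this change.

```shell
# Sketch only: replace in-tree amdgpu artifacts with the DKMS-built ones.
# Arguments (all hypothetical names):
#   $1 kernel version, $2 builder stage root, $3 final image root
swap_amdgpu() {
    kver="$1"
    builder_root="$2"
    target_root="$3"

    # Assumed in-tree locations of the amdgpu module and firmware.
    rm -rf "${target_root}/lib/modules/${kver}/kernel/drivers/gpu/drm/amd/amdgpu"
    rm -rf "${target_root}/lib/firmware/amdgpu"

    # Destinations named in the description above.
    mkdir -p "${target_root}/lib/modules/${kver}/extra" \
             "${target_root}/lib/firmware/updates/amdgpu"

    # Copy the DKMS-built modules and the firmware from the builder root.
    cp -a "${builder_root}/lib/modules/${kver}/extra/." \
          "${target_root}/lib/modules/${kver}/extra/"
    cp -a "${builder_root}/lib/firmware/updates/amdgpu/." \
          "${target_root}/lib/firmware/updates/amdgpu/"
}
```

In a real image the module dependency data would probably need to be regenerated afterwards, for example with `depmod -a "${KERNEL_VERSION}"`; that step is left out of the sketch.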

The change also moves the repository definitions to the repos.d folder and adds the AMD public key to verify the signatures of the AMD RPMs.
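
For readers unfamiliar with signed repositories, a repository definition with signature checking enabled might look roughly like what the sketch below prints. The helper `render_repo`, the repository id, the base URL, and the key path are all hypothetical placeholders, not the values used in this change; only the `gpgcheck` and `gpgkey` options are standard DNF repository settings.

```shell
# Sketch only: print a .repo definition that enforces RPM signature checks.
# Arguments (all hypothetical): $1 repo id, $2 base URL, $3 absolute key path
render_repo() {
    repo_id="$1"
    baseurl="$2"
    keyfile="$3"
    cat <<EOF
[${repo_id}]
name=${repo_id}
baseurl=${baseurl}
enabled=1
gpgcheck=1
gpgkey=file://${keyfile}
EOF
}
```

For example, `render_repo amdgpu "https://example.invalid/amdgpu/el9" /etc/pki/rpm-gpg/RPM-GPG-KEY-AMD > repos.d/amdgpu.repo` would write a definition that points at a locally installed copy of the public key.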

Users call a wrapper script named `ilab` that hides the `instructlab` container image and the command line options. This change copies the file from `nvidia-bootc` and adjusts the logic. The main change is that the `/dev/kfd` and `/dev/dri` devices are passed to the container instead of `nvidia.com/gpu=all`. The `ilab` wrapper is copied into the `amd-bootc` image.
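
The device handling can be sketched as follows. This is not the actual `ilab` wrapper: the function name `ilab_wrapper`, the `CONTAINER_IMAGE` default, and the overridable `PODMAN` variable are assumptions made for illustration, as is invoking `ilab` as the container command. Only `/dev/kfd`, `/dev/dri`, and the `podman run --device` option come from the description and from Podman itself.

```shell
# Sketch only: run ilab in a container with the AMD GPU devices attached.
# CONTAINER_IMAGE is a hypothetical placeholder image reference.
CONTAINER_IMAGE="${CONTAINER_IMAGE:-registry.example.invalid/instructlab-amd:latest}"

ilab_wrapper() {
    # PODMAN can be overridden (e.g. PODMAN=echo) to inspect the command line.
    # The NVIDIA variant would pass nvidia.com/gpu=all instead of these devices.
    ${PODMAN:-podman} run --rm -it \
        --device /dev/kfd \
        --device /dev/dri \
        "${CONTAINER_IMAGE}" ilab "$@"
}
```

Setting `PODMAN=echo` before calling `ilab_wrapper chat` prints the arguments instead of starting a container, which is a cheap way to check which devices would be passed.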

The Makefile is also modified to reflect these changes.

Signed-off-by: Fabien Dupont <[email protected]>
@fabiendupont fabiendupont force-pushed the amd-bootc-use-out-of-tree-drivers branch from 45d0a2c to d33c8cb Compare August 27, 2024 12:23
@yevgeny-shnaidman yevgeny-shnaidman merged commit 8c51b1b into yevgeny-shnaidman:yevgeny/amd-bootc-6.1.2 Aug 28, 2024